Abstract

Neural networks with synaptic weights constructed according to the 
weighted Hebb rule, a variant of the familiar Hebb rule, are studied in the 
presence of noise (finite temperature), when the number of stored patterns is
finite and in the limit that the number of neurons $N \to \infty$. The fact that
different patterns enter the synaptic rule with different weights changes the 
configuration of the free energy surface. For a general choice of weights not all 
of the patterns are stored as global minima of the free energy function.
However, as for the case of the usual Hebb rule, there exists a temperature range
in which only the stored patterns are minima of the free energy. In particular, 
in the presence of a single extra pattern stored with an appropriate weight in 
the synaptic rule, the temperature at which the spurious minima of the free 
energy are eliminated is significantly lower than for a similar network without 
this extra pattern. The convergence time of the network, together with the 
overlaps of the equilibria of the network with the stored patterns, can thereby 
be improved considerably. 
1 Introduction



The statistical mechanics of large neural networks with the Hebb rule prescription 
for the synaptic weights has been studied in detail and is now well-understood [1,2]. 
In this paper, we shall study the statistical mechanics of neural nets with synaptic 
weights which are constructed according to the weighted Hebb rule. For orthogonal 
patterns, the Hebb rule indeed stores the required patterns as fixed points of the 
deterministic updating dynamics, as is well known. The role of the weighted Hebb 
rule in the storage of non-orthogonal patterns was examined in ref. [3]. The weighted
Hebb rule is also a natural choice when one wants to store patterns or classes of 
patterns with varying radii of attraction. Although the precise dependence of the 
radii of attraction on the weights with which different patterns enter the synaptic 
rule is difficult to study, it is clear that this rule offers the possibility of adjusting the 
radii of attraction. 

Our principal motivation for studying the weighted Hebb rule arises from the 
expectation that the presence of different weights for different patterns would affect 
the configuration of the free energy surface. There is the possibility that some of the 
degeneracy of the minima of the free energy would be lifted; in addition, the range of 
useful operating temperatures of the network would be changed. We shall find that 
in fact, the critical operating temperature of the network can be suitably lowered 
by a judicious choice of weights. The time needed for the network to converge to 
useful equilibrium states can be thereby reduced, since lower noise levels mean faster 
convergence times. Additionally, at lower temperatures, the overlaps of the network 
equilibria with the stored memories are larger; the overall quality of memory recall 
of the network can thus be significantly enhanced. 

In the next section, we present the evaluation of the free energy along the lines 
of ref. [1]. The stationary-point conditions yield the mean field equations (MFE's) for
equilibrium states in the large-$N$ limit. In section 3, we derive the stability conditions
of the solutions to the mean field equations. In section 4, various critical temperatures 
for the existence of stable equilibria are calculated. In the last section we summarise 
our conclusions. 
2 The weighted Hebb rule



We start with a network of $N$ neurons with states $s_i(t) = \pm 1$ at time $t$. At time $t + 1$,
the probability of $s_i$ flipping sign is

$$W(s_i \to -s_i) = \left(1 + \exp(2\beta s_i h_i)\right)^{-1},$$

where

$$h_i = \sum_{j = 1,\, j \neq i}^{N} J_{ij}\, s_j(t)$$

is the local field or potential at neuron $i$ due to all the other neurons, and $\beta$ is an
inverse noise parameter (equivalently, $T = 1/\beta$ is a 'temperature' parameter).
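The update rule can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the choice of asynchronous updates in random order is an assumption, since the text does not specify the update order.

```python
import math
import random


def local_field(J, s, i):
    """Local field h_i = sum over j != i of J[i][j] * s[j]."""
    return sum(J[i][j] * s[j] for j in range(len(s)) if j != i)


def flip_probability(s_i, h_i, beta):
    """W(s_i -> -s_i) = 1 / (1 + exp(2 * beta * s_i * h_i))."""
    a = 2.0 * beta * s_i * h_i
    if a > 700.0:  # exp would overflow; the flip probability is effectively 0
        return 0.0
    return 1.0 / (1.0 + math.exp(a))


def sweep(J, s, beta, rng):
    """One asynchronous sweep in random order (the update order is an assumption)."""
    order = list(range(len(s)))
    rng.shuffle(order)
    for i in order:
        if rng.random() < flip_probability(s[i], local_field(J, s, i), beta):
            s[i] = -s[i]
    return s
```

A spin aligned with its local field flips with probability below 1/2, and the flip probabilities of $s_i$ and $-s_i$ in the same field sum to one.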

Since $J_{ij}$ is symmetric, an energy function

$$H = -\frac{1}{2} \sum_{i \neq j} J_{ij}\, s_i s_j$$

can be attached to every configuration $[s]$. At zero temperature ($\beta \to \infty$), the network
converges to a local minimum of this energy function.

Given a set of $p$ patterns (finite in number) $\sigma_i^\mu$, $\mu = 1, \ldots, p$, to be stored, we could
try to store these patterns by constructing the synaptic weights $J_{ij}$ in the form

$$J_{ij} = \frac{1}{N} \sum_{\mu, \nu = 1}^{p} g_{\mu\nu}\, \sigma_i^\mu \sigma_j^\nu,$$

where the numbers $g_{\mu\nu}$ are positive. Without the $g_{\mu\nu}$ factor (that is, for
$g_{\mu\nu} = \delta_{\mu\nu}$) this would be nothing but the usual Hebb rule. However, with the
$g_{\mu\nu}$, if we require the patterns $\sigma$ to be stored as fixed points in the $N \to \infty$
limit, the statistically significant contributions come only from the $g_{\mu\mu}$ terms in
$J_{ij}$. We shall therefore retain only these diagonal terms $g_{\mu\mu} = g_\mu$ in the synaptic
rule, yielding the weighted Hebb rule:

$$J_{ij} = \frac{1}{N} \sum_{\mu = 1}^{p} g_\mu\, \sigma_i^\mu \sigma_j^\mu.$$

At finite temperatures $T$, one needs to look for minima of the free energy to identify
metastable states of the network. Accordingly, we need to evaluate the partition
function

$$Z = \mathrm{Tr}\, e^{-\beta H}.$$
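As a small illustration of the weighted Hebb rule, the sketch below builds the couplings in the form $J_{ij} = (1/N) \sum_\mu g_\mu \sigma_i^\mu \sigma_j^\mu$ and evaluates the energy of a configuration. The conventional $1/N$ normalisation, the vanishing diagonal $J_{ii} = 0$ and the function names are assumptions made for the example.

```python
def weighted_hebb(patterns, g):
    """Couplings J[i][j] = (1/N) * sum_mu g[mu] * sigma[mu][i] * sigma[mu][j]."""
    n = len(patterns[0])
    J = [[0.0] * n for _ in range(n)]
    for g_mu, sigma in zip(g, patterns):
        for i in range(n):
            for j in range(n):
                if i != j:  # no self-coupling (assumption: J_ii = 0)
                    J[i][j] += g_mu * sigma[i] * sigma[j] / n
    return J


def energy(J, s):
    """H = -(1/2) * sum over i != j of J[i][j] * s[i] * s[j]."""
    n = len(s)
    return -0.5 * sum(J[i][j] * s[i] * s[j]
                      for i in range(n) for j in range(n) if i != j)
```

For two orthogonal patterns stored with unequal weights, the pattern with the larger weight sits at the lower energy, which is the zero-temperature counterpart of the statement that only the largest-weight Mattis states are global minima.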
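Proceeding as in ref. [1], and assuming that the stored patterns are random, we define
overlap variables $m^\mu = (g_\mu / N) \sum_i \sigma_i^\mu \langle s_i \rangle$. In the $N \to \infty$ limit, the free energy per
neuron $f = F/N$ and the stationary point equations we get in the evaluation of $Z$
then take the form

$$f(\beta) = \frac{1}{2} \sum_\mu \frac{(m^\mu)^2}{g_\mu} - \frac{1}{\beta} \langle\langle \log(2 \cosh \beta\, \mathbf{m} \cdot \boldsymbol{\sigma}) \rangle\rangle, \qquad (1)$$

and

$$m^\mu = g_\mu \langle\langle \sigma^\mu \tanh \beta\, \mathbf{m} \cdot \boldsymbol{\sigma} \rangle\rangle \qquad (2)$$

respectively. Here $\langle\langle \cdot \rangle\rangle$ denotes the average over the random patterns.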
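For a small number of patterns the mean field equations can be solved by direct iteration. The sketch below assumes the forms $f = \frac{1}{2}\sum_\mu (m^\mu)^2/g_\mu - T \langle\langle \log 2\cosh(\mathbf{m}\cdot\boldsymbol{\sigma}/T) \rangle\rangle$ and $m^\mu = g_\mu \langle\langle \sigma^\mu \tanh(\mathbf{m}\cdot\boldsymbol{\sigma}/T) \rangle\rangle$, takes the pattern average to be over all $2^p$ equally likely sign vectors (an assumption about what "random" means), and uses plain fixed-point iteration, which is a choice made here rather than a method taken from the text.

```python
import itertools
import math


def pattern_average(func, p):
    """Average of func(sigma) over all 2^p equally likely sigma in {-1, +1}^p."""
    total = 0.0
    for sigma in itertools.product((-1, 1), repeat=p):
        total += func(sigma)
    return total / 2 ** p


def dot(m, sigma):
    return sum(a * b for a, b in zip(m, sigma))


def mfe_rhs(m, g, T):
    """Right-hand side of the MFE: g_mu * << sigma^mu tanh(m.sigma / T) >>."""
    p = len(g)
    return [g[mu] * pattern_average(lambda s, mu=mu: s[mu] * math.tanh(dot(m, s) / T), p)
            for mu in range(p)]


def free_energy(m, g, T):
    """f = (1/2) sum_mu m_mu^2 / g_mu - T << log(2 cosh(m.sigma / T)) >>."""
    quad = 0.5 * sum(x * x / gm for x, gm in zip(m, g))
    ent = pattern_average(lambda s: math.log(2.0 * math.cosh(dot(m, s) / T)), len(g))
    return quad - T * ent


def solve_mfe(m0, g, T, iters=2000, tol=1e-12):
    """Iterate m <- rhs(m) from the starting guess m0 until it stops changing."""
    m = list(m0)
    for _ in range(iters):
        new = mfe_rhs(m, g, T)
        if max(abs(a - b) for a, b in zip(new, m)) < tol:
            return new
        m = new
    return m
```

Starting from a guess with a single non-zero component, for example `solve_mfe([0.9, 0.0, 0.0], [1.0, 0.8, 0.6], 0.3)`, the iteration settles on a Mattis state whose overlap satisfies $m = g_1 \tanh(m/T)$.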

We shall first look for solutions to the stationary point or mean field equations
at zero temperature.

$T \to 0$:

In this limit, the tanh function becomes a sign function, and $\log(2\cosh y) \to |y|$
as $|y| \to \infty$.

If $\mathbf{m}$ has only one non-zero component (the Mattis states), for instance $\mathbf{m} =
(m, 0, \ldots, 0)$, then $m = g_1$ (up to an irrelevant sign), and $f = -(1/2) g_1$. Of course, for a
choice of $\mathbf{m} = (0, m, 0, \ldots, 0)$ one would get $m = g_2$ and $f = -(1/2) g_2$, etc., but without
loss of generality we will confine ourselves to only the former ansatz. From this we
can see that the lowest-energy state, and hence a stable state, is

$$\mathbf{m} = \pm g_{\max} (1, 0, \ldots, 0),$$

with $g_{\max}$ being the largest component of $\mathbf{g}$. The stability of these states can also
be seen from the MFE and the free energy directly: we have $f = -(1/2) \sum_\mu (1/g_\mu)(m^\mu)^2$
and $\sum_\mu (1/g_\mu)(m^\mu)^2 \le g_{\max}$ (this follows from
$\langle\langle |\mathbf{m} \cdot \boldsymbol{\sigma}| \rangle\rangle \le (\sum_\mu (m^\mu)^2)^{1/2}$), implying that
these Mattis states occupy global minima.

We note, however, that the other Mattis states, corresponding to $g_\mu < g_{\max}$,
are not global minima. Nevertheless, they are certainly local minima (and hence 
metastable states) at zero temperature, and they exist as stable states for sufficiently 
low temperatures as well, as we shall see. 

Any state which is not a Mattis state will be called a spurious state, as is usual; 
the Mattis states are the ones desirable for associative memory purposes. 
For symmetric states with $n$ non-zero components of the type $\mathbf{m} =
m_n (1, \ldots, 1, 0, \ldots, 0)$, the MFE's imply that one must require

$$g_1 = g_2 = \cdots = g_n = g,$$

in which case we get

$$m_n = \pm \frac{g}{n} \langle\langle |z_n| \rangle\rangle, \qquad z_n = \sum_{\mu = 1}^{n} \sigma^\mu,$$

and

$$f_n = -\frac{g}{2n} \langle\langle |z_n| \rangle\rangle^2 .$$

We shall call such states symmetric states corresponding to $g$.

These equations differ from those obtained in [1] only in the factor of g appearing 
on the right-hand side, resulting in the same ordering of the $f_n$'s as that of [1].

We can also consider general states of the form $\mathbf{m} = (m_1, m_2, m_3, \ldots, m_n, 0, \ldots, 0)$,
with nonzero $m$'s. However, to reduce technical complexity we restrict ourselves to
the case of $n = 3$. Since there are no non-trivial solutions with $n = 2$, as we shall
see, this will be sufficient to establish a definitive conclusion. It is easy to see that
the $T = 0$ limit of these states is $\mathbf{m} = (1/2)(g_1, g_2, g_3, 0, \ldots, 0)$, and their stability will
be discussed in the next section.



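We shall assume that the states we wish to look at start appearing just below
a critical temperature (which depends on the state); correspondingly, the overlaps $\mathbf{m}$ are
small near this temperature, and we can expand the cosh and tanh functions in a
series in $\mathbf{m}$, keeping only the first few terms. Then our equations become (to the
appropriate order)

$$m^\mu = g_\mu \left[ \beta m^\mu - \frac{\beta^3}{3} \langle\langle \sigma^\mu (\mathbf{m} \cdot \boldsymbol{\sigma})^3 \rangle\rangle \right] \qquad (3)$$

and

$$f = \frac{1}{2} \sum_\mu \frac{(m^\mu)^2}{g_\mu} - \frac{\beta}{2} \sum_\mu (m^\mu)^2 + \frac{\beta^3}{12} \langle\langle (\mathbf{m} \cdot \boldsymbol{\sigma})^4 \rangle\rangle - T \log 2. \qquad (4)$$

We can see that there exists a critical temperature, above which the only solution
is the trivial one $\mathbf{m} = 0$. For example, for the Mattis state $\mathbf{m} = (m, 0, \ldots, 0)$, the solution
is

$$m^2(g_1) \approx 3 g_1 (g_1 - T),$$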
where, again, the subscript "1" can be elevated to a $\mu$, since a general Mattis state
can have its non-zero component in any slot. Also the associated free energy is

$$f(\beta) + T \log 2 = -\frac{3}{4 g_\mu} (g_\mu - T)^2.$$

The critical temperature for the appearance of a Mattis state with one non-zero
component $m^\mu$ is therefore $T = g_\mu$. For all of the Mattis states to exist as solutions
of the stationary point equations, therefore, the operating temperature of the network
must satisfy $T < g_s$, where $g_s$ is the smallest of the weights $g_\mu$.
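The behaviour just below the critical temperature can be checked numerically. The sketch below assumes the single-component mean field equation $m = g \tanh(m/T)$ for a Mattis state and solves it by bisection (a choice made here, not taken from the text); the result can then be compared with the small-$m$ estimates $m^2 \approx 3g(g - T)$ and $f + T \log 2 \approx -\frac{3}{4g}(g - T)^2$, which follow from expanding tanh and cosh to the lowest non-trivial order.

```python
import math


def mattis_overlap(g, T, tol=1e-14):
    """Positive solution of m = g * tanh(m / T); returns 0 for T >= g."""
    if T >= g:
        return 0.0
    lo, hi = 1e-12, g  # g*tanh(m/T) - m is positive at lo and negative at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g * math.tanh(mid / T) - mid > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def free_energy_shift(g, T):
    """f + T log 2 for the Mattis state, from f = m^2/(2g) - T log(2 cosh(m/T))."""
    m = mattis_overlap(g, T)
    return m * m / (2.0 * g) - T * math.log(math.cosh(m / T))
```

For $g = 1$ and $T = 0.99$ the numerical overlap agrees with $\sqrt{3g(g - T)}$ to within about a percent, and the agreement improves as $T$ approaches $g$ from below.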

The symmetric states $\mathbf{m} = m_n (1, \ldots, 1, 0, \ldots, 0)$ still require that we have $\mathbf{g} =
(g, \ldots, g, g_{n+1}, \ldots, g_p)$, with $n$ $g$'s, and to be a solution $m_n$ must be

$$m_n^2 \approx \frac{3 g (g - T)}{3n - 2}.$$

The corresponding free energies take the form

$$f + T \log 2 = -\frac{3n}{4 g (3n - 2)} (g - T)^2,$$

so that for given $g$, the $n = 1$ state has the least free energy among the symmetric
states corresponding to that $g$. In the next section we will see that these are in fact
unstable above a certain critical temperature, and so we postpone the discussion of
stability to that section.

For the general asymmetric states, having restricted our attention to the $n = 3$
case (i.e. $\mathbf{m} = (m_1, m_2, m_3, 0, \ldots, 0)$), we shall show that these states are also unstable
at $T = g_\mu$. The stability of these states will be discussed at length in the next section.



3 Stability 

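The positivity of the eigenvalues of the stability matrix $\partial^2 f / \partial m^\mu \partial m^\nu$ assures the
stability of the states. From (2) we get

$$\frac{\partial^2 f}{\partial m^\mu \partial m^\nu} = \frac{1}{g_\mu} \delta_{\mu\nu} - \beta \left( \delta_{\mu\nu} - Q_{\mu\nu} \right),$$

where

$$Q_{\mu\nu} = \langle\langle \sigma^\mu \sigma^\nu \tanh^2 \beta\, \mathbf{m} \cdot \boldsymbol{\sigma} \rangle\rangle .$$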
Zero temperature:

As we discussed in the previous section, the Mattis states are stable at $T = 0$ for
$\mathbf{g} = (g_{\max}, g_2, \ldots, g_p)$, with $\mathbf{m} = (g_{\max}, 0, \ldots, 0)$ being the global minimum.

For the symmetric states with $n$ non-zero components
$\mathbf{m} = m_n (1, 1, \ldots, 1, 0, 0, \ldots, 0)$, with $\mathbf{g} = (g, \ldots, g, g_\alpha)$ and $\alpha = n + 1, \ldots, p$, we
find the eigenvalues of the stability matrix to be

$$\lambda_1 = \frac{1}{g} - \beta (1 - q_n) + \beta (n - 1) Q,$$

$$\lambda_2^{(\alpha)} = \frac{1}{g_\alpha} - \beta (1 - q_n), \qquad (5)$$

$$\lambda_3 = \frac{1}{g} - \beta (1 - q_n) - \beta Q,$$

where $q_n = Q_{\mu\mu}$ and $Q = Q_{\mu\nu}$ for $\mu \neq \nu \le n$.

As in ref. [1], in the $T \to 0$ limit, $1 - q$ stays finite for even $n$, while $q$ goes
to unity exponentially in $\beta$ for odd $n$ and $Q$ goes to zero exponentially. Therefore
the eigenvalues are all positive for the odd $n$ states, while the even $n$ states are all
unstable due to the presence of negative eigenvalues.

In the case $n = 3$, for instance, and in the limit $T = 0$, we see that $q_n = 1$ and
$Q = 0$, giving $\lambda_1 = 1/g$, $\lambda_2 = 1/g_\alpha$, and $\lambda_3 = 1/g$, all of which are positive, yielding
stability.

Similarly, for $T = 0$ and for asymmetric states, all the $p$ eigenvalues reduce to
their respective $1/g_\mu$, again yielding stability.

Finite temperatures: 

For the symmetric states, at the temperature $T \approx g$, we have $q \approx 3n (g - T) / [(3n - 2) g]$ and
$Q = (2q/n)$. Then we can see that

$$\lambda_3 = \frac{1}{g} - \beta (1 - q) - \beta Q \approx \frac{4}{3n - 2}\, \frac{1}{g^2} (T - g),$$

which is clearly negative for $T < g$, except for $n = 1$ where $\lambda = (2/g^2)(g - T) > 0$. The
($n > 1$) symmetric states are therefore unstable at $T \approx g$.

Let us mention in passing that $\lambda_1 = \frac{1}{g} - \beta (1 - q) + \beta (n - 1) Q$ becomes, for $n = 1$
states, $(2/g^2)(g - T)$, which is positive below the temperature $T = g$. The eigenvalue $\lambda_3$
is not present for $n = 1$ states. The sign of $\lambda_2$ depends explicitly on the various
components of $\mathbf{g}$, and this shall be discussed further below. However, that $\lambda_3$ is
negative is sufficient to render the symmetric states with $n > 1$ unstable. The exact
temperature at which their stability, as well as the stability of the asymmetric states,
is lost will be derived in the next section.
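The sign pattern of the eigenvalues for symmetric states can be explored numerically. The sketch below assumes the eigenvalues in the form $\lambda_1 = 1/g - \beta(1 - q) + \beta(n - 1)Q$, $\lambda_2 = 1/g_\alpha - \beta(1 - q)$ and $\lambda_3 = 1/g - \beta(1 - q) - \beta Q$, with $q = \langle\langle \tanh^2 \beta m z_n \rangle\rangle$ and $Q = \langle\langle \sigma^1 \sigma^2 \tanh^2 \beta m z_n \rangle\rangle$, where $z_n$ is the sum of $n$ independent random signs. The function names and the fixed-point iteration for the overlap are choices made for this example.

```python
import math


def z_distribution(n):
    """Values and probabilities of z_n = sigma^1 + ... + sigma^n for random signs."""
    return [(n - 2 * k, math.comb(n, k) / 2 ** n) for k in range(n + 1)]


def symmetric_overlap(n, g, T, iters=5000):
    """Iterate m = (g / n) << z_n tanh(m z_n / T) >>, starting from m = g."""
    m = g
    for _ in range(iters):
        m = (g / n) * sum(p * z * math.tanh(m * z / T) for z, p in z_distribution(n))
    return m


def symmetric_eigenvalues(n, g, g_alpha, T):
    """Return (lambda_1, lambda_2, lambda_3) for the n-symmetric state."""
    beta = 1.0 / T
    m = symmetric_overlap(n, g, T)
    dist = z_distribution(n)
    q = sum(p * math.tanh(beta * m * z) ** 2 for z, p in dist)
    if n > 1:
        # At fixed z_n, the average of sigma^1 sigma^2 is (z^2 - n) / (n (n - 1)).
        Q = sum(p * (z * z - n) / (n * (n - 1)) * math.tanh(beta * m * z) ** 2
                for z, p in dist)
    else:
        Q = 0.0
    lam1 = 1.0 / g - beta * (1.0 - q) + beta * (n - 1) * Q
    lam2 = 1.0 / g_alpha - beta * (1.0 - q)
    lam3 = 1.0 / g - beta * (1.0 - q) - beta * Q
    return lam1, lam2, lam3
```

With $g = 1$, the $n = 3$ state has $\lambda_3 > 0$ at $T = 0.3$ and $\lambda_3 < 0$ at $T = 0.6$, while the $n = 2$ state already has $\lambda_3 < 0$ at $T = 0.1$, in line with the instability of even-$n$ states.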



4 Critical Temperatures 

First, we deal with the symmetric states. We showed that of the eigenvalues (5) of
the stability matrix, $\lambda_1$ is positive in the range $T = 0$ to $T \approx g$, $\lambda_3$ changes sign
from $+$ to $-$, while the sign of $\lambda_2$ depends on the form of $\mathbf{g}$ explicitly (see below).
Hence, there are two possibilities to consider: one is where $\lambda_3$ is set to zero, to find
the critical temperature $T = T_c$ at which $\lambda_3$ changes sign, while $\lambda_2$ is constrained to
be positive at that temperature $T_c$. The second case is where $\lambda_2$ is set to zero to find
the critical temperature $T = T_c^*$, at which $\lambda_2$ changes sign, while $\lambda_3$ is constrained to
be positive. The former gives

$$T_c = g (1 - q_n + Q) \qquad (6)$$

and requiring $\lambda_2 > 0$ at $T = T_c$ gives the constraint

$$\frac{g_\alpha}{g} < 1 + \frac{Q}{1 - q_n}, \qquad (7)$$

where we recall $\mathbf{g} = (g, \ldots, g, g_\alpha)$, with $n$ components equal to $g$ and $\alpha = n + 1, \ldots, p$.
The MFE (2), when specialized to $n = 3$ symmetric states, yields

$$x = \frac{g}{4T} (\tanh x + \tanh 3x), \qquad (8)$$

where $x = \beta m_3$. With $q_3 = (1/4)(\tanh^2 3x + 3 \tanh^2 x)$ and $Q = (1/4)(\tanh^2 3x -
\tanh^2 x)$, solving (6) and (8) numerically for $x$ and $T_c/g$, we obtain

$$x = 0.94, \qquad \frac{T_c}{g} = 0.46, \qquad \text{for } \frac{g_\alpha}{g} < 1.32,$$

with the last constraint coming from (7).
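The numerical solution quoted above is easy to reproduce. The sketch below assumes the relations $T_c/g = 1 - q_3 + Q$ and $x = (g/4T)(\tanh x + \tanh 3x)$ for the $n = 3$ symmetric state, eliminates $T/g$ between them, and finds the root in $x$ by bisection; the bracketing interval and helper names are choices made for this example.

```python
import math


def q3(x):
    return 0.25 * (math.tanh(3 * x) ** 2 + 3 * math.tanh(x) ** 2)


def Q3(x):
    return 0.25 * (math.tanh(3 * x) ** 2 - math.tanh(x) ** 2)


def mismatch(x):
    """T/g from the MFE minus T_c/g from lambda_3 = 0; zero at the critical point."""
    t_mfe = (math.tanh(x) + math.tanh(3 * x)) / (4 * x)
    t_crit = 1 - q3(x) + Q3(x)
    return t_mfe - t_crit


def bisect(func, lo, hi, tol=1e-12):
    """Root of func in [lo, hi], assuming func changes sign on the interval."""
    f_lo = func(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


x_c = bisect(mismatch, 0.1, 2.0)
tc_over_g = 1 - q3(x_c) + Q3(x_c)              # critical temperature in units of g
ratio_bound = 1 + Q3(x_c) / (1 - q3(x_c))      # upper bound on g_alpha / g
```

To two decimal places the three quantities agree with the values $x = 0.94$, $T_c/g = 0.46$ and $g_\alpha/g < 1.32$ given in the text.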

Our results up to this point do not differ significantly from those of [1]. However,
let us go on further to the second case with $n = 1$.
Let $g_s$ be the smallest of the $g_\mu$'s, and consider the corresponding Mattis state
$\mathbf{M} = (0, 0, \ldots, m_s, 0, \ldots, 0)$. The smallest of the eigenvalues in this case is $\lambda_2$ with

$$\lambda_2 = \frac{1}{g_{\max}} - \beta (1 - q_s),$$

where $g_{\max}$ is the largest of the $g_\mu$'s, and $q_s$ is the corresponding value of $q$. Now to
avoid spurious $n = 3$ states corresponding to $g_{\max}$ (which exist whenever $g_{\max}$ occurs
at least three times in the set of $g$'s), the operating temperature of the network must
be greater than $T_c = 0.46 g_{\max}$, as we have seen. At this temperature, in order for $\mathbf{M}$
to exist as a stable state, $\lambda_2$ must be positive, or at best zero. This gives the relation

$$1 - q_s \le 0.46;$$

together with the MFE

$$\frac{m_s}{g_s} = \tanh \beta m_s,$$

this yields the constraint $g_s / g_{\max} > 0.589$ on the value that the smallest $g$ can take,
if all the given patterns are to be stored as stable Mattis states of the network.
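The bound on the smallest weight follows from two relations, and the arithmetic can be sketched as below. The sketch assumes the marginal condition $1 - q_s = T/g_{\max}$ with $q_s = \tanh^2 \beta m_s$, together with $m_s = g_s \tanh \beta m_s$; the function name and its argument `t_op` (the operating temperature in units of $g_{\max}$) are hypothetical.

```python
import math


def min_weight_ratio(t_op):
    """Smallest g_s / g_max for which the Mattis state of weight g_s is still
    stable at the operating temperature T = t_op * g_max."""
    tanh_u = math.sqrt(1.0 - t_op)  # marginal case: 1 - tanh^2(u) = t_op
    u = math.atanh(tanh_u)          # u = beta * m_s
    # m_s = g_s * tanh(u) and u = m_s / T give g_s / g_max = t_op * u / tanh(u).
    return t_op * u / tanh_u
```

With `t_op = 0.46` this gives a ratio close to the value 0.589 quoted above; the small difference in the third decimal place comes from rounding $T_c / g_{\max}$ to 0.46.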

Turning now to the case of the $n = 3$ symmetric states corresponding to $g$, and
for $g_\alpha / g > 1.32$, where $g_\alpha$ occurs only once or two times among the $g_\mu$'s, we see that

$$\frac{T_c^*}{g} = \frac{g_\alpha}{g} (1 - q_3), \qquad \text{with } \frac{g_\alpha}{g} > 1.32,$$

some of whose solutions can be tabulated as follows:



    $g_\alpha/g$    1.32   1.34   1.42   1.66   2.0    3.0
    $x$             0.94   0.96   1.04   1.21   1.37   1.69
    $T_c^*/g$       0.46   0.45   0.43   0.38   0.34   0.29
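The entries of this table can be regenerated with a short script. The sketch below assumes that the tabulated temperature is fixed by $\lambda_2 = 0$, that is $T/g = (g_\alpha/g)(1 - q_3)$, combined with the $n = 3$ mean field equation $x = (g/4T)(\tanh x + \tanh 3x)$; the bracketing interval for the bisection is a choice made here.

```python
import math


def q3(x):
    return 0.25 * (math.tanh(3 * x) ** 2 + 3 * math.tanh(x) ** 2)


def t_star(ratio):
    """Return (x, T/g) at which lambda_2 vanishes for g_alpha / g = ratio > 1."""
    def mismatch(x):
        # T/g from the MFE minus T/g from lambda_2 = 0.
        return (math.tanh(x) + math.tanh(3 * x)) / (4 * x) - ratio * (1 - q3(x))

    lo, hi = 0.05, 10.0  # mismatch is negative at lo and positive at hi
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    return x, ratio * (1 - q3(x))
```

For example, `t_star(2.0)` and `t_star(3.0)` return values that round to the corresponding table columns.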



We can now see that, whereas for $g_\alpha < 1.32 g$, the critical temperature is simply
$0.46 g$, for $g_\alpha > 1.32 g$, the critical temperatures are all lower than the former. If
$g_0$ is the largest $g$ which occurs at least three times, the operating temperature of
the network must be at least $0.46 g_0$ if the largest $g$ bigger than $g_0$, $g_{\max}$, satisfies
$g_{\max} < 1.32 g_0$. This minimum necessary temperature for the avoidance of spurious
equilibria is lowered when $g_{\max} > 1.32 g_0$. In other words, by adding additional
patterns with sufficiently large weights, we can lower the temperature above which
there are no spurious states, leading to a "better" network. What is meant by "better"
will be discussed in the next section.

The symmetric states with n > 3 can be shown to have even lower critical 
temperatures, exactly as in [1]. Therefore, it is sufficient to consider the $n = 3$ states
only. 

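We now proceed to the case of the asymmetric states $\mathbf{m} = (m_1, m_2, m_3, 0, \ldots, 0)$
with a general weight vector, i.e. $\mathbf{g} = (g_1, g_2, g_3, g_\alpha)$. The MFE's can be written in
the form

$$x + y = \frac{g_1}{2T} \left[ \tanh(x + y + z) + \tanh x + \tanh y - \tanh z \right],$$

$$x + z = \frac{g_2}{2T} \left[ \tanh(x + y + z) + \tanh x - \tanh y + \tanh z \right],$$

$$y + z = \frac{g_3}{2T} \left[ \tanh(x + y + z) - \tanh x + \tanh y + \tanh z \right],$$

where $x = \beta (m_1 + m_2 - m_3)$, $y = \beta (m_1 - m_2 + m_3)$, $z = \beta (-m_1 + m_2 + m_3)$, and
the secular equation, dictating stability, can be written (with the eigenvalues rescaled by $T$) as

$$\left( \lambda^3 + l_2 \lambda^2 + l_1 \lambda + l_0 \right) \prod_{\alpha = n + 1}^{p} \left( \frac{T}{g_\alpha} - (1 - q) - \lambda \right) = 0,$$

where

$$l_2 = 3 (1 - q) - \frac{T}{g_1} \left( 1 + \frac{1}{(g_2/g_1)} + \frac{1}{(g_3/g_1)} \right),$$

$$l_1 = 3 (1 - q)^2 - Q_1^2 - Q_2^2 - Q_3^2
- 2 \frac{T}{g_1} (1 - q) \left( 1 + \frac{1}{(g_2/g_1)} + \frac{1}{(g_3/g_1)} \right)
+ \left( \frac{T}{g_1} \right)^2 \left( \frac{1}{(g_2/g_1)} + \frac{1}{(g_3/g_1)} + \frac{1}{(g_2/g_1)(g_3/g_1)} \right),$$

$$l_0 = -\left( 2 Q_1 Q_2 Q_3 + (1 - q)(Q_1^2 + Q_2^2 + Q_3^2) - (1 - q)^3 \right)
- \frac{T}{g_1} (1 - q)^2 \left( 1 + \frac{1}{(g_2/g_1)} + \frac{1}{(g_3/g_1)} \right)
+ \frac{T}{g_1} \left( Q_3^2 + \frac{Q_2^2}{(g_2/g_1)} + \frac{Q_1^2}{(g_3/g_1)} \right)$$

$$\qquad + \left( \frac{T}{g_1} \right)^2 (1 - q) \left( \frac{1}{(g_2/g_1)} + \frac{1}{(g_3/g_1)} + \frac{1}{(g_2/g_1)(g_3/g_1)} \right)
- \left( \frac{T}{g_1} \right)^3 \frac{1}{(g_2/g_1)(g_3/g_1)},$$

and

$$Q_1 = Q_{12} = \frac{1}{4} \left[ \tanh^2(x + y + z) + \tanh^2 x - \tanh^2 y - \tanh^2 z \right],$$

$$Q_2 = Q_{13} = \frac{1}{4} \left[ \tanh^2(x + y + z) - \tanh^2 x + \tanh^2 y - \tanh^2 z \right],$$

$$Q_3 = Q_{23} = \frac{1}{4} \left[ \tanh^2(x + y + z) - \tanh^2 x - \tanh^2 y + \tanh^2 z \right],$$

$$q = \frac{1}{4} \left[ \tanh^2(x + y + z) + \tanh^2 x + \tanh^2 y + \tanh^2 z \right].$$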
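The cubic part of the secular equation can be generated directly from the $3 \times 3$ block of the stability matrix rather than from expanded formulae. The sketch below assumes that this block, multiplied by $T$, has diagonal entries $T/g_\mu - (1 - q)$ and off-diagonal entries $Q_{12}$, $Q_{13}$, $Q_{23}$, and obtains the coefficients of $\lambda^3 + l_2 \lambda^2 + l_1 \lambda + l_0$ from its trace, principal minors and determinant. The function names are hypothetical.

```python
import math


def moments(x, y, z):
    """Return q and (Q_1, Q_2, Q_3) = (Q_12, Q_13, Q_23) for given x, y, z."""
    a = math.tanh(x + y + z) ** 2
    tx, ty, tz = math.tanh(x) ** 2, math.tanh(y) ** 2, math.tanh(z) ** 2
    q = 0.25 * (a + tx + ty + tz)
    Q1 = 0.25 * (a + tx - ty - tz)
    Q2 = 0.25 * (a - tx + ty - tz)
    Q3 = 0.25 * (a - tx - ty + tz)
    return q, Q1, Q2, Q3


def cubic_coefficients(T, g, q, Q1, Q2, Q3):
    """Coefficients (l2, l1, l0) of lambda^3 + l2 lambda^2 + l1 lambda + l0 for
    the 3x3 block of T times the stability matrix; g = (g_1, g_2, g_3)."""
    e1, e2, e3 = (T / gm - (1.0 - q) for gm in g)
    l2 = -(e1 + e2 + e3)                                            # minus the trace
    l1 = e1 * e2 + e1 * e3 + e2 * e3 - Q1 * Q1 - Q2 * Q2 - Q3 * Q3  # principal minors
    det = (e1 * e2 * e3 - e1 * Q3 * Q3 - e2 * Q2 * Q2 - e3 * Q1 * Q1
           + 2 * Q1 * Q2 * Q3)
    return l2, l1, -det


def block_is_stable(T, g, x, y, z):
    """All three roots are positive exactly when l2 < 0, l1 > 0 and l0 < 0."""
    q, Q1, Q2, Q3 = moments(x, y, z)
    l2, l1, l0 = cubic_coefficients(T, g, q, Q1, Q2, Q3)
    return l2 < 0 and l1 > 0 and l0 < 0
```

In the symmetric limit $x = y = z$ with equal weights, the block has a doubly degenerate eigenvalue $T/g - (1 - q) - Q$, so both $l_1$ and $l_0$ vanish at the symmetric critical temperature $T_c = g(1 - q + Q)$.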



Since we are generally interested in the temperature $T_c$ at which a given eigenvalue
becomes zero (i.e. changes sign from $+$ to $-$), there are two separate cases
we can consider: one is where $\lambda^{(\alpha)} = T/g_\alpha - (1 - q)$ is set to zero, while the other
3 eigenvalues (from the cubic part) are constrained to be nonnegative. The second
choice is to set one of the 3 eigenvalues from the cubic part equal to zero and demand
for $\lambda^{(\alpha)}$ and the remaining 2 eigenvalues to be non-negative.

The former case gives

$$\frac{T_c^*}{g_1} = \frac{g_\alpha}{g_1} (1 - q) \qquad (9)$$

and the positivity of the other eigenvalues can be ensured by the constraints

$$l_2 < 0, \qquad l_1 > 0, \qquad \text{and} \qquad l_0 < 0.$$

The first of these constraints, in conjunction with (9), simplifies to

$$\frac{g_\alpha}{g_1} > 3 \left( 1 + \frac{1}{(g_2/g_1)} + \frac{1}{(g_3/g_1)} \right)^{-1}.$$

The last two constraints, due to their dependence on the $Q$'s and the $q$, must be
imposed numerically in finding $T_c^*$. Some results are shown in the table below for the
case when $g_2 = g_3$. These spurious states are stable for $g_\alpha > g_\alpha^*$ and $T < T_c^*$.



    $g_2/g_1$           0.6    0.8    0.95   1.0
    $g_\alpha^*/g_1$    2.0    1.89   1.6    1.32
    $T_c^*/g_1$         0.18   0.27   0.37   0.46
In the second case, since we are interested only in the zero eigenvalues, it is
sufficient to set $l_0 = 0$, and solve this equation numerically along with the MFE's.
For the remaining eigenvalues to be nonnegative we must require

$$l_2 < 0, \qquad l_1 > 0, \qquad \text{and} \qquad \frac{g_\alpha}{g_1} < \frac{T_c/g_1}{1 - q}.$$

Some results of this calculation are shown in the table below for $g_3 = g_2$.



    $g_2/g_1$    1.1    1.2    1.3    1.4    1.5
    $T_c/g_1$    0.29   0.22   0.19   0.15   0.11



Again, these spurious states are stable for $T < T_c$.

We note that for $\mathbf{g}$ of the form $(g_1, g_2, g_3, \ldots)$, with $g_1 = g_2 = 1$ and $g_3 = 1.32$, the
critical temperature of the associated spurious state $(m_1, m_2, m_3, 0, \ldots, 0)$ with $m_1 = m_2$
is close to 0.19. If $g_1$ occurs at least three times in $\mathbf{g}$, the critical temperature of the
$n = 3$ symmetric state corresponding to $g_1$ is 0.46. We can in fact make the general
statement that if $g_{\max}$ is the largest component of $\mathbf{g}$, and $g_0$ the second largest, for
$g_{\max}/g_0 > 1.32$, the critical temperature above which there are no spurious states is
determined by demanding the instability of spurious states with non-zero entries $m_i$
of $\mathbf{m}$ corresponding to $g_i \le g_0$.

A set of results for asymmetric states with $g_2 \neq g_3$ is also given in the following
table.

    $g_2/g_1$    $g_3/g_1$    $T_c/g_1$
    0.93         0.55         0.12
    0.91         0.82         0.21
    0.9          0.6          0.13
    0.85         0.7          0.15
    0.8          0.7          0.13
    0.8          0.6          0.10
    0.75         0.65         0.10
    0.7          0.66         0.09

We can also present our results in the following format, which clarifies the behaviour of
$T_c/g_1$ for various values of $g_2/g_1$ and $g_3/g_1$.

    $g_3/g_1$ \ $g_2/g_1$    0.8    0.9    1.0    1.1    1.2
    0.8                      .12           .27           .15
    0.9                             .25    .33    .23
    1.0                      .27    .33    .46    .29    .22
    1.1                             .23    .29    .37
    1.2                      .15           .22           .35

The apparent symmetry of this table is simply due to the symmetry of the MFE's and
the secular equation under the simultaneous exchange of $2 \leftrightarrow 3$ and $x \leftrightarrow y$. It is now
evident that all the critical temperatures we have obtained for the asymmetric states
are smaller than $0.46 g_0$ (where $g_0$ is the largest $g$ that occurs at least three times), as
one moves away from the Hebbian case at the center of the table.

5 Conclusion 

Our investigation of the use of the weighted Hebb rule in Hopfield networks has 
revealed that the structure of the minima of the free energy at finite temperatures 
can be quite distinct from the case of the usual Hebb rule. In particular, by choosing
the weighting factors for the various patterns appropriately, spurious states can be
destabilised at a significantly lower temperature compared to that for the usual Hebb 
rule. When the operating temperature of the network is larger than the largest among 
the critical temperatures for the various spurious states, we have a network where only 
the Mattis states (corresponding to the stored patterns) are equilibria of the network. 

Specifically, we can make the following rather general statements. 

(1) If the largest of the $g_\mu$'s, $g_{\max}$, occurs at least three times, then the
temperature range in which no spurious states exist is $0.46 g_{\max} < T < g_{\max}$. If the
largest $g$ which occurs at least three times is $g_0$, and the largest $g$, $g_{\max}$, occurs no
more than two times and satisfies $g_{\max} > 1.32 g_0$, then the critical temperature above
which no stable spurious states exist is smaller than $0.46 g_0$, and can be calculated as
we have shown.

(2) If the smallest of the $g_\mu$'s is $g_{\min}$, and the largest one, $g_{\max}$, occurs at least
three times, and the constraint $g_{\min}/g_{\max} > 0.589$ is satisfied, all of the patterns to
be stored exist as stable Mattis states in the range of temperatures where spurious
states are excluded. If $g_{\max}$ occurs only once or twice, this constraint on the ratio of
$g_{\min}$ to $g_{\max}$ is changed and can be calculated in a manner analogous to that shown
in section 4.

One consequence of the lowering of the useful operating temperature is that
convergence of the network to metastable states would be faster. A second consequence
is that the overlaps of the equilibria of the network with the stored patterns would
be larger due to the reduced temperature. Given a set of patterns to be stored, one
could then simply put in an extra pattern with a weight $g$ sufficiently larger than
the $g_\mu$'s of the other patterns to construct the synapses. The resulting
network would then converge to an equilibrium state closer to one of the stored
patterns and at a faster rate than a network constructed without this extra pattern
being taken into account. It would be interesting to carry out detailed simulations
of networks employing the weighted Hebb rule and to determine the relative sizes of
the basins of attraction for the different stored patterns.